-
-
Notifications
You must be signed in to change notification settings - Fork 10
Engine optimization PR guidelines #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start!
I think my most general feedback addresses the structure. You're pretty close to what I'd do myself, but there are some small things I'd change (going by what the user would want to / need to read):
- Intro (explain our general mind-set, I think you nailed that already)
- Choosing what to optimize (going into profiling, bottlenecks, exposed functions, and performance problems reported in issues. For me, this includes the "best / worst case" / "happy / sad path" info)
- Optimizing the code (going into the optimization workflow. We can probably do this mostly by adding links to resources, but we should explain the general mindset and tools you have directly here. e.g. benchmarks, low-level vs high-level, etc.
- Making a pull request (e.g. explaining that the thought process and tests should be documented, and that we need to see numbers vs
master
) - Detailed / domain aware info (e.g. the gpu stuff)
Alright, I updated this with the review comments and to address some of the ordering suggestions from @Ivorforce's review. I also fleshed out the process a bit more. |
Just to mention the general optimization section in the regular documentation (https://docs.godotengine.org/en/4.5/tutorials/performance/general_optimization.html#performant-design) does contain some guidance on low level / high level optimization and we should probably link to this where possible and try and avoid too much repetition of similar info in the contributor docs. Although yes this could be done with a sentence reminding to ensure to consider high level optimizations before getting carried away with low level. On the flip side, high level optimizations are usually more prone to regressions, but overall they should be preferred in the long run. |
Actually I've realised we should probably add a paragraph or two about good / bad benchmarks, pitfalls, random data, hot / cold cache, and optimizers completely optimizing out the benchmark. Can have a look later if none of you guys get to it. Update: |
I would avoid going too much into detail, since you could write multiple books about optimization practices. But two or three paragraphs is probably a good idea. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The improvements are great. Looking very promising!
Instructions on using some common profilers with Godot can be found `here | ||
<https://docs.godotengine.org/en/stable/engine_details/development/debugging/using_cpp_profilers.html>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this, along with a broad overview of strengths / weaknesses, in a new section called "Tools for optimization". This would include:
- Profilers
- Benchmarks
- Assembly viewer / godbolt (with a link to Agner Fog's resources)
- A milliseconds per frame counter
- Maybe something else I forgot :D
Instructions on using some common profilers with Godot can be found `here | |
<https://docs.godotengine.org/en/stable/engine_details/development/debugging/using_cpp_profilers.html>`_. |
engine/guidelines/optimization.rst
Outdated
- Explain why you chose to optimize this code (e.g. include the profiling result, link the issue report, etc.). | ||
- Show that you improved the code either by profiling again, or running systematic benchmarks. | ||
- Test on multiple platforms where appropriate, especially mobile. | ||
- When micro-optimizing, show assembly before / after to confirm how the optimization is working. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather put this in the "tools" section mentioned above. Assembly is difficult to wrap your head around, and I wouldn't consider it a requirement if the code is provably way faster than before.
- When micro-optimizing, show assembly before / after to confirm how the optimization is working. |
040a70c
to
5551c25
Compare
Have done another pass, may have accidentally squashed the commits, but don't think that matters too much as this is a group PR anyway. Have left some of @Ivorforce suggestions for now for breaking out, maybe we could do that as a followup after merging initial version? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed @lawnjelly. I think this is great for a first version. I'm happy to create a follow-up PR with a tools section.
Often in optimization there can be situations where optimizing for a rare case slows a | ||
common case, or vice versa. Be aware that surprisingly often in games, optimizing for | ||
the worst case can be more important than optimizing for the best case, as worst cases | ||
can cause dropped frames. | ||
|
||
In situations where a PR is trading off speed in different paths, reviewers may have to | ||
decide whether a change is worth making. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good opportunity to mention the value of 1% lows and 0.1% lows in framerate reporting. Unfortunately, Godot lacks built-in metrics for this (godotengine/godot#67136 implements the former), so you'll have to use third-party utilities like MangoHud or Special K to get them.
A practical example is that it's better to have a 70 FPS average and a 60 FPS 1% low rather than a 150 FPS average and a 30 FPS 1% low. You can cap to 60 FPS and have a great experience in the former scenario, but not in the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems more relevant to profiling games than to optimize Godot with a targeted PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I recall correctly there's a profiling article where this could be mentioned (if it's appropriate, I am not sure right now), and then referenced here to avoid duplicate information.
Even more so than for CPU work, it is essential to test GPU changes on mobile as well as | ||
desktop, and the more platforms, the better. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would mention the usual categories of GPUs to optimize for here:
- Dedicated GPUs, with dedicated memory.
- Integrated GPUs (shared memory = memory bandwidth is often the bottleneck; minimize texture reads).
- Mobile and Apple GPUs (TBDR = many desktop-friendly algorithms run slower than usual. Also has shared memory).
Also, I'd emphasize that the Mobile renderer is meant to run on mobile GPUs first and foremost, and that performance on desktop/laptop GPUs (including integrated graphics) is a byproduct rather than its primary goal. On the other hand, Compatibility is meant to perform well on all kinds of devices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go with lawnjelly's suggestion to merge the PR as is, and add this information in a dedicated follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start!
Aside of stylistic changes, I've added some comments to point out where the page could be extended, but this can be left to a future PR (which I can work on).
5551c25
to
23e7b57
Compare
The article looks great. I've integrated @Calinou's suggestions in a final push. The rest of the comments seem to be focused around adding / improving the information rather than major structural changes, so I think this PR is good to go. I'll go ahead with lawnjelly's suggestion to merge this now, and leave other improvements for follow-up PRs. Thanks again to everyone involved! |
As discussed in core PR meeting a couple of weeks ago, here is a draft of some notes for contributors making optimization PRs.
Feel free to take over and iterate on. 👍